"AI Models Are Lying to Us" Here's the AI Research Lab Trying to Solve This | APOLLO RESEARCH

Update: 2025-10-16

Description

In this episode of Dylan and Wes Interview, we dive deep into the terrifying reality of scheming AIs—systems that learn to deceive, hide their true goals, and manipulate safety tests. Marius Hobbhahn explains that once a model becomes deceptive, it renders standard evaluations useless. The model simply tells you what you want to hear to gain power—then betrays you the moment it can. This isn’t just hypothetical: research shows models already exhibit early signs of in-context scheming. If safety checks can be faked, the stakes go way up. Spotting deception early might be the last safeguard we get.

Comments

In Channel

Can Grok and Claude run a business? We just did it

2025-12-2901:28:44

Avi Loeb reveals the truth about 3I/ATLAS

2025-12-2201:01:08

China Just Popped America's AI Bubble: Cyrus Janssen Reveals What Happens Next!

2025-12-0301:02:30

AI Safety Expert: All Jobs Gone by 2027 - Dr. Roman Yampolskiy

2025-11-2801:33:59

1,000 days left until the "Final Collapse" | Emad Mostaque

2025-11-1201:18:35

What Happens When AIs Learn Politics and Deception?

2025-10-2301:21:34

"AI Models Are Lying to Us" Here's the AI Research Lab Trying to Solve This | APOLLO RESEARCH

2025-10-1601:20:48

he just vibe coded a million dollar startup

2025-10-0701:05:25

if anyone builds it, everyone dies

2025-10-0201:00:32

Is AI replacing coders?

2025-09-3001:19:50

Alex (Ticker Symbol U)

2025-09-1801:56:36

AI will crush Hollywood

2025-09-1701:01:51

How I use AI to create viral videos from a remote island

2025-09-1101:20:40

The Secret to Creating Videogames With AI | Vibe Coding Entire Games From Scratch with Max Hertan

2025-09-0901:24:13

Julia McCoy - AI Avatars, Escaping Cults, Age of Abundance and the YouTube Roller Coaster

2025-08-2856:42

Nick Bostrom - Superintelligence, Deep Utopia, Human Purpose and Understanding Consciousness

2025-08-2201:04:56

AI Village is getting scary good, AI Agents 2x every 4 months | Adam Binksmith from AI Digest

2025-08-2001:07:40

Matt Wolfe - Superintelligence, AI Recommends Murder, Surveillance States & AI Disrupting Hollywood.

2025-08-1701:48:05

AI Labs and Pirated Data, BILLIONS in Liability and Deepfakes | Prof. Christa Laser Explains

2025-08-0102:18:33

Robotic Gods, Limitless Tyrants, Illusion of Reality, p(doom) and the Tech Utopia w/ VegasLowRoller

2025-07-2902:08:25

00:00

"AI Models Are Lying to Us" Here's the AI Research Lab Trying to Solve This | APOLLO RESEARCH

Wes Roth and Dylan Curious

#box-pro-ellipsis-176748839219995{-webkit-line-clamp:2;}"AI Models Are Lying to Us" Here's the AI Research Lab Trying to Solve This | APOLLO RESEARCH

"AI Models Are Lying to Us" Here's the AI Research Lab Trying to Solve This | APOLLO RESEARCH

Wes Roth and Dylan Curious

"AI Models Are Lying to Us" Here's the AI Research Lab Trying to Solve This | APOLLO RESEARCH